refactor(core): model adaptor#2480

Merged
EAGzzyCSL merged 2 commits into main from zzy/model-adaptor
Jun 4, 2026
Conversation

@EAGzzyCSL

Collaborator

No description provided.

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces a model adapter pattern that centralizes per-model-family behavior (JSON parsing, chat completion params, image preprocessing, planning, locate, reasoning) into a single registry. It also restructures packages/core/src/ai-model into models/, prompts/, shared/, and workflows/ subtrees, and pulls bbox normalization out of common.ts into a dedicated locate-result adapter framework. Service-level deep-locate flow is rebuilt around adapter.locate.supportsSearchArea and a new resolveLocateSearchArea helper. The PR title is marked "wip".

Changes:

  • Introduce ResolvedModelAdapter plus per-family adapters (qwen, doubao, gemini, gpt, glm, auto-glm, ui-tars) and route planning/locate/reasoning/image-detail through them.
  • Replace adaptBbox* / pointToBbox / fillBboxParam / bboxDescription with a LocateResultAdapter (extractRawLocateResult → resolveLocateResult → normalizeResultToPixelBbox) and pixel mapping helpers; rename generateElementByRect/Point to createLocateResultElementFromRect/Point and move them to @/locate-result-element.
  • Restructure the ai-model directory (prompt/ → prompts/, new workflows/{inspect,planning,generation,image-preprocess}, new shared/{json,model-locate-result}); rewrite locate/section-locate/planning around prepareModelImage and buildSearchAreaConfig; the service now uses first-pass locate when a model does not support search-area locate, and ServiceDump.matchedElement becomes a single optional element.
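
To make the registry idea above concrete, here is a minimal sketch of a per-family adapter lookup. Every name in it (ModelAdapterSketch, registerAdapter, resolveAdapter, the "family-a"/"family-b" keys) is hypothetical and chosen for illustration; only the supportsSearchArea flag under locate is taken from the overview. It is not the ResolvedModelAdapter type from this PR.

```typescript
// Hypothetical shapes for illustration; not the actual types in packages/core.
interface ModelAdapterSketch {
  family: string;
  locate: { supportsSearchArea: boolean };
  // Families may differ in how leniently they parse the raw model reply.
  parseJson(raw: string): unknown;
}

// Strict parsing: the reply must be valid JSON as-is.
const strictParseJson = (raw: string): unknown => JSON.parse(raw);

// Lenient parsing: ignore any prose around the outermost JSON object.
const lenientParseJson = (raw: string): unknown => {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end < start) {
    throw new Error("no JSON object found in model reply");
  }
  return JSON.parse(raw.slice(start, end + 1));
};

const registry = new Map<string, ModelAdapterSketch>();

function registerAdapter(adapter: ModelAdapterSketch): void {
  registry.set(adapter.family, adapter);
}

function resolveAdapter(family: string): ModelAdapterSketch {
  const adapter = registry.get(family);
  if (!adapter) {
    throw new Error(`no adapter registered for "${family}"`);
  }
  return adapter;
}

// Placeholder families; which real families support search-area locate
// is not stated in the overview.
registerAdapter({
  family: "family-a",
  locate: { supportsSearchArea: true },
  parseJson: strictParseJson,
});
registerAdapter({
  family: "family-b",
  locate: { supportsSearchArea: false },
  parseJson: lenientParseJson,
});
```

With a lookup like this, callers ask the resolved adapter for a capability (for example resolveAdapter("family-b").locate.supportsSearchArea) rather than branching on the model name at each call site.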

Reviewed changes

Copilot reviewed 107 out of 112 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
| --- | --- |
| packages/core/src/ai-model/models/** | New model adapter registry, types, ResolvedModelAdapter, per-family adapters. |
| packages/core/src/ai-model/shared/{json,model-locate-result}/** | Extracted JSON parser and locate-result adapter framework. |
| packages/core/src/ai-model/workflows/{inspect,planning,generation,image-preprocess}/** | New workflow modules replacing prior inspect/llm-planning/prompt files. |
| packages/core/src/ai-model/prompts/** | Renamed from prompt/; locate/section/planning prompts now use response-format descriptors. |
| packages/core/src/ai-model/service-caller/{reasoning,image-detail,client,error}.ts | Reasoning resolution rewritten over adapters; image-detail helper removed; client/error extracted. |
| packages/core/src/ai-model/models/auto-glm/, models/ui-tars/ | Auto-GLM and UI-TARS adapters moved under models/; prompts split into per-locale exports. |
| packages/core/src/service/index.ts | Locate flow refactored around resolveLocateSearchArea, supportsSearchArea, single-element parseResult. |
| packages/core/src/agent/{agent,tasks,task-builder,utils}.ts | Replanning limit and plan selection driven by adapter; uses new createLocateResultElementFromRect. |
| packages/core/src/{common,types,index,locate-result-element}.ts | Removed bbox adapters from common; matchedElement is now optional single element; new public locate-result helpers. |
| packages/core/src/ai-model/auto-glm/{index,util}.ts | Old auto-glm helpers/util removed. |
| packages/shared/src/extractor/{index,dom-util}.ts, src/img/transform.ts | Removed generateElementByPoint/Rect and paddingImage option from cropByRect. |
| packages/playground/src/server.ts | Switched to createLocateResultElementFromPoint from @midscene/core. |
| packages/evaluation/tests/llm-planning.test.ts | Updated to adaptModelLocateResultToRect options shape. |
| packages/core/tests/** | Large test refactor: imports updated to new module paths, new tests for adapters/image-preprocess/locate-result, removed adapt_bbox and generate-element-by-rect tests, updated reasoning/empty-content expectations. |
| packages/shared/tests/unit-test/iife-bundle.test.ts | Removed generateElementByRect from required IIFE exports and documented usage status of the remaining exports. |
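
The locate flow summarized for packages/core/src/service/index.ts can be pictured roughly as follows. This is a simplified, synchronous sketch under stated assumptions: locateWithOptionalRefine and expandToSearchArea are hypothetical names, and deriving the search area by padding the first-pass rect is an assumption, not the documented behavior of resolveLocateSearchArea. The real flow presumably calls the model asynchronously.

```typescript
type Rect = { left: number; top: number; width: number; height: number };
type ScreenSize = { width: number; height: number };

// Hypothetical helper: pad the first-pass rect and clip it to the screenshot.
function expandToSearchArea(rect: Rect, screen: ScreenSize, padding: number): Rect {
  const left = Math.max(0, rect.left - padding);
  const top = Math.max(0, rect.top - padding);
  const right = Math.min(screen.width, rect.left + rect.width + padding);
  const bottom = Math.min(screen.height, rect.top + rect.height + padding);
  return { left, top, width: right - left, height: bottom - top };
}

// Hypothetical two-pass locate: keep the first-pass result when the model
// cannot locate inside a search area, otherwise refine within a padded area.
function locateWithOptionalRefine(
  supportsSearchArea: boolean,
  screen: ScreenSize,
  firstPass: () => Rect,
  refineWithin: (area: Rect) => Rect,
  padding = 100,
): { rect: Rect; refined: boolean } {
  const coarse = firstPass();
  if (!supportsSearchArea) {
    return { rect: coarse, refined: false };
  }
  const area = expandToSearchArea(coarse, screen, padding);
  return { rect: refineWithin(area), refined: true };
}
```

The point of the sketch is the branch on the capability flag: the caller never needs to know which model family it is talking to, only whether a second pass is available.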

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented May 18, 2026

Deploying midscene with Cloudflare Pages

Latest commit: 5fda79f
Status: ✅  Deploy successful!
Preview URL: https://9dd2e5ac.midscene.pages.dev
Branch Preview URL: https://zzy-model-adaptor.midscene.pages.dev

@EAGzzyCSL marked this pull request as draft May 18, 2026 05:08
@EAGzzyCSL force-pushed the zzy/model-adaptor branch 2 times, most recently from 5f70754 to b0181d4 May 29, 2026 09:01
@EAGzzyCSL requested a review from Copilot May 29, 2026 09:02

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 130 out of 131 changed files in this pull request and generated 3 comments.

Comment thread packages/core/tests/unit-test/utils.test.ts
Comment thread packages/core/src/ai-model/prompt/llm-locator.ts
Comment thread packages/core/src/ai-model/inspect.ts Outdated
@EAGzzyCSL force-pushed the zzy/model-adaptor branch 2 times, most recently from 906025d to 3f290c6 June 1, 2026 13:31
@EAGzzyCSL requested a review from Copilot June 1, 2026 14:50

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 134 out of 135 changed files in this pull request and generated 3 comments.

Comment thread packages/shared/src/env/parse-model-config.ts Outdated
Comment thread packages/core/src/ai-model/prompt/llm-locator.ts

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 138 out of 139 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

packages/core/src/ai-model/models/ui-tars/planning.ts:46

  • pointToPixelBbox clamps right/bottom to width/height, which can produce locatedPixelBbox values that are outside the valid pixel index range (0..width-1 / 0..height-1). Since this bbox is consumed directly as screenshot coordinates, it can lead to out-of-bounds rects.

packages/core/src/ai-model/models/auto-glm/actions.ts:34

  • autoGLMCoordinateToPixelBbox can return right === width / bottom === height when the model returns coordinates near 1000, which is outside the valid pixel index range (0..width-1 / 0..height-1). These bboxes are embedded into locatedPixelBbox and used directly downstream, so they should be clamped/scaled to width - 1 / height - 1.
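
The clamping suggested in both suppressed comments could look something like the sketch below. The function names (normalizedToPixel, pointToClampedPixelBbox), the 0..1000 input scale, and the halfSize default are assumptions for illustration; they are not the signatures of pointToPixelBbox or autoGLMCoordinateToPixelBbox in the PR.

```typescript
// [left, top, right, bottom], all as inclusive pixel indices.
type PixelBbox = [left: number, top: number, right: number, bottom: number];

const clamp = (value: number, min: number, max: number): number =>
  Math.min(Math.max(value, min), max);

// Map a coordinate on an assumed 0..scale range to a pixel index in 0..size-1,
// so that an input equal to `scale` lands on the last valid pixel, not on `size`.
function normalizedToPixel(value: number, size: number, scale = 1000): number {
  return clamp(Math.round((value / scale) * (size - 1)), 0, size - 1);
}

// Build a small box around a pixel point, clamping every edge to the
// valid index range rather than to width/height.
function pointToClampedPixelBbox(
  x: number,
  y: number,
  width: number,
  height: number,
  halfSize = 10, // hypothetical default box half-width
): PixelBbox {
  return [
    clamp(x - halfSize, 0, width - 1),
    clamp(y - halfSize, 0, height - 1),
    clamp(x + halfSize, 0, width - 1),
    clamp(y + halfSize, 0, height - 1),
  ];
}
```

For a 1920x1080 screenshot, normalizedToPixel(1000, 1920) yields 1919, and a point near the right edge produces a box whose right edge stops at 1919 instead of 1920.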

Comment thread packages/core/src/ai-model/shared/model-locate-result/factory.ts Outdated
Comment thread packages/core/src/ai-model/shared/model-locate-result/factory.ts Outdated
Comment thread packages/core/src/ai-model/models/auto-glm/locate.ts Outdated
@EAGzzyCSL force-pushed the zzy/model-adaptor branch from e25b70f to f46fcff June 3, 2026 09:39
@EAGzzyCSL marked this pull request as ready for review June 3, 2026 11:10
@EAGzzyCSL changed the title from "wip: model adaptor" to "refactor(core): model adaptor" Jun 3, 2026
@EAGzzyCSL force-pushed the zzy/model-adaptor branch from 70fd558 to 5fda79f June 4, 2026 12:33
@EAGzzyCSL merged commit e617acb into main Jun 4, 2026
16 checks passed
@EAGzzyCSL deleted the zzy/model-adaptor branch June 4, 2026 13:42
3 participants